Query Weighting for Ranking Model Adaptation
Abstract
We propose to directly measure the importance of source-domain queries to a target domain in which no rank labels of documents are available, a process we refer to as query weighting. Query weighting is a key step in ranking model adaptation. Because the learning object of ranking algorithms is divided by query instances, we argue that it is more reasonable to conduct importance weighting at the query level than at the document level. We present two query weighting schemes. The first compresses each query into a query feature vector, which aggregates all document instances in the same query, and then conducts query weighting based on that vector. This method estimates query importance efficiently by compressing query data, but it risks information loss resulting from the compression. The second measures the similarity between the source query and each target query, and then combines these fine-grained similarity values to estimate the source query's importance. Adaptation experiments on the LETOR3.0 data set demonstrate that query weighting significantly outperforms document instance weighting methods.
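The two schemes described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how documents are aggregated, how similarity is measured, or how similarity values are combined, so everything concrete here is an assumption made for illustration: the mean and standard deviation aggregate, the RBF kernel, the averaging step, and all function names (`compress_query`, `weights_by_compression`, `query_similarity`, `weights_by_pairwise_similarity`) are hypothetical and are not the authors' method.

```python
import numpy as np

def compress_query(docs):
    """Aggregate one query's documents (n_docs x n_features) into a single vector.
    Assumption: per-feature mean and standard deviation serve as the aggregate."""
    docs = np.asarray(docs, dtype=float)
    return np.concatenate([docs.mean(axis=0), docs.std(axis=0)])

def rbf(a, b, gamma=1.0):
    """RBF kernel between two vectors (an assumed similarity measure)."""
    return float(np.exp(-gamma * np.sum((a - b) ** 2)))

def weights_by_compression(source_queries, target_queries, gamma=1.0):
    """Scheme 1 sketch: weight each source query by how close its compressed
    vector lies to the centroid of the compressed target queries."""
    centroid = np.mean([compress_query(q) for q in target_queries], axis=0)
    return np.array([rbf(compress_query(q), centroid, gamma) for q in source_queries])

def query_similarity(source_docs, target_docs, gamma=1.0):
    """Fine-grained similarity of two queries: mean RBF kernel over all
    (source document, target document) pairs. Assumption, not from the source."""
    s = np.asarray(source_docs, dtype=float)
    t = np.asarray(target_docs, dtype=float)
    sq_dists = ((s[:, None, :] - t[None, :, :]) ** 2).sum(axis=-1)
    return float(np.exp(-gamma * sq_dists).mean())

def weights_by_pairwise_similarity(source_queries, target_queries, gamma=1.0):
    """Scheme 2 sketch: compare a source query with every target query,
    then combine the similarity values by averaging."""
    return np.array([
        np.mean([query_similarity(sq, tq, gamma) for tq in target_queries])
        for sq in source_queries
    ])

# Toy data: each query is a small matrix of document feature vectors.
target = [np.array([[0.0, 0.0], [1.0, 1.0]]), np.array([[0.0, 1.0], [1.0, 0.0]])]
source = [np.array([[0.0, 0.0], [1.0, 1.0]]), np.array([[10.0, 10.0], [11.0, 11.0]])]

w1 = weights_by_compression(source, target)
w2 = weights_by_pairwise_similarity(source, target)
```

In this toy setup the first source query resembles the target queries and the second does not, so both sketches give the first query the larger weight. The second sketch keeps per-document detail at the cost of comparing every document pair, which mirrors the efficiency versus information-loss trade-off that the abstract mentions.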
Similar resources
A New Model for Phrase Search Based on Weighted Minimum Displacement
Finding high-quality web pages is one of the most important tasks of search engines. The relevance between the documents found and the query searched depends on the user observation and increases the complexity of ranking algorithms. The other issue is that users often explore just the first 10 to 20 results while millions of pages related to a query may exist. So search engines have to use sui...
Evaluation of Term Ranking Algorithms for Pseudo-Relevance Feedback in MEDLINE Retrieval
OBJECTIVES: The purpose of this study was to investigate the effects of query expansion algorithms for MEDLINE retrieval within a pseudo-relevance feedback framework. METHODS: A number of query expansion algorithms were tested using various term ranking formulas, focusing on query expansion based on pseudo-relevance feedback. The OHSUMED test collection, which is a subset of the MEDLINE databas...
TREC-8 Automatic Ad-Hoc Experiments at Fondazione Ugo Bordoni
We present further evidence suggesting the feasibility of using information-theoretic query expansion for improving the retrieval effectiveness of automatic document ranking. Compared to our participation in TREC-7, in which we applied this technique to an ineffective initial ranking, here we show that information-theoretic query expansion may be effective even when the quality of the first pas...
Nonlinear ranking function representations in genetic programming-based ranking discovery for personalized search
The ranking function is instrumental in determining the performance of a search engine. Designing and optimizing a search engine's ranking function remains a daunting task for computer and information scientists. Recently, genetic programming (GP), a machine learning technique based on evolutionary theory, has shown promise in tackling this very difficult problem. Ranking functions discovered by GP h...
The Role of Multi-word Units in Interactive Information Retrieval
The paper presents several techniques for selecting noun phrases for interactive query expansion following pseudo-relevance feedback and a new phrase search method. A combined syntactico-statistical method was used for the selection of phrases. First, noun phrases were selected using a part-of-speech tagger and a noun-phrase chunker, and secondly, different statistical measures were applied to s...